Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepare and show gRPC usage in CLI #206

Merged
merged 27 commits into from
Nov 4, 2021

Conversation

kradalby
Copy link
Collaborator

This PR prepares and demonstrates the upcoming work to get the CLI communicating with the server via gRPC instead of modifying the database.

It sets up:

  • Authentication (currently deny everything but localhost)
  • Prepare for API keys to be used as initial authentication for gRPC/HTTP api for clients
  • Lint and restructure proto files after buf's suggestions
  • Add a demonstration of create, delete and list namespaces with gRPC
    • Those were the easiest commands that would also use the integration tests.

I need help testing this in the wild, please help me set this up on servers and actually test that it behaves like you would expect.

  • Use the CLI to create namespaces with a running server
  • test with and without Certs (lets encrypt and others)
  • anything I could have forgotten.

This commit sets up the API and gRPC endpoints and adds authentication
to them. Currently there is no actual authentication implemented but it
has been prepared for API keys.

In addition, there is a allow put in place for gRPC traffic over
localhost. This has two purposes:

1. grpc-gateway, which is the base of the API, connects to the gRPC
   service over localhost.
2. We do not want to break current "on server" behaviour which allows
   users to use the cli on the server without any fuzz
app.go Outdated Show resolved Hide resolved
This commit changes the way CLI and grpc-gateway communicates with the
gRPC backend to socket, instead of localhost. Unauthenticated access now
goes on the socket, while the network interface will require API key (in
the future).
Copy link
Contributor

@cure cure left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried running this version and experimented with the namespaces cli commands. Both client and server are running this branch.

namespaces list => works
namespaces create => timeout:

2021-10-31T14:37:01Z DBG HEADSCALE_ADDRESS environment is not set, connecting to unix socket. socket=/var/run/headscale.sock
2021-10-31T14:37:01Z TRC .../headscale/cmd/headscale/cli/utils.go:371 > Connecting via gRPC address=/var/run/headscale.sock
2021-10-31T14:37:06Z FTL Could not connect: context deadline exceeded error="context deadline exceeded"

The unix socket file exists.

namespaces rename => works
namespaces destroy => not tested yet

Related: after I switched to this branch as the server, my tailscale clients are now all listed as offline in headscale nodes list, and the tailscale logs of the clients say

Oct 31 10:32:28 clientname tailscaled[3491991]: Received error: PollNetMap: Post "https://x.x.x.x/machine/REDACTED/map": http: server gave HTTP response to HTTPS client

@@ -304,6 +316,69 @@ func getHeadscaleApp() (*headscale.Headscale, error) {
return h, nil
}

func getHeadscaleGRPCClient() (apiV1.HeadscaleServiceClient, *grpc.ClientConn) {
// TODO(kradalby): Make configurable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to just pass in a context here. That way the caller can either just use the background context if they don't care, or set a timeout etc as desired on a context and pass that in.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense, will do.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, that makes a lot more sense.

@kradalby
Copy link
Collaborator Author

kradalby commented Oct 31, 2021

I've tried running this version and experimented with the namespaces cli commands. Both client and server are running this branch.

namespaces list => works
namespaces rename => works
namespaces destroy => not tested yet

These should work as they will be moved in an upcoming PR

namespaces create => timeout:

2021-10-31T14:37:01Z DBG HEADSCALE_ADDRESS environment is not set, connecting to unix socket. socket=/var/run/headscale.sock
2021-10-31T14:37:01Z TRC .../headscale/cmd/headscale/cli/utils.go:371 > Connecting via gRPC address=/var/run/headscale.sock
2021-10-31T14:37:06Z FTL Could not connect: context deadline exceeded error="context deadline exceeded"
The unix socket file exists.

hmm that is interesting, I will try to replicate locally, I got it working in docker and a simple "unrestricted" machine.

Could you check if it could be a permission issue with the socket? Either the socket itself, or if it has some systemd (or similar) limitations?

Related: after I switched to this branch as the server, my tailscale clients are now all listed as offline in headscale nodes list, and the tailscale logs of the clients say

Oct 31 10:32:28 clientname tailscaled[3491991]: Received error: PollNetMap: Post "https://x.x.x.x/machine/REDACTED/map": http: server gave HTTP response to HTTPS client

Will investigate this as well

@kradalby
Copy link
Collaborator Author

Related: after I switched to this branch as the server, my tailscale clients are now all listed as offline in headscale nodes list, and the tailscale logs of the clients say

Oct 31 10:32:28 clientname tailscaled[3491991]: Received error: PollNetMap: Post "https://x.x.x.x/machine/REDACTED/map": http: server gave HTTP response to HTTPS client

I think this should be sorted.

2021-10-31T14:37:01Z DBG HEADSCALE_ADDRESS environment is not set, connecting to unix socket. socket=/var/run/headscale.sock
2021-10-31T14:37:01Z TRC .../headscale/cmd/headscale/cli/utils.go:371 > Connecting via gRPC address=/var/run/headscale.sock
2021-10-31T14:37:06Z FTL Could not connect: context deadline exceeded error="context deadline exceeded"
The unix socket file exists.

I managed to replicate this, now I just need to figure out why.

@kradalby
Copy link
Collaborator Author

2021-10-31T14:37:01Z DBG HEADSCALE_ADDRESS environment is not set, connecting to unix socket. socket=/var/run/headscale.sock
2021-10-31T14:37:01Z TRC .../headscale/cmd/headscale/cli/utils.go:371 > Connecting via gRPC address=/var/run/headscale.sock
2021-10-31T14:37:06Z FTL Could not connect: context deadline exceeded error="context deadline exceeded"
The unix socket file exists.

This should be addressed now. Please have another try @cure.

It was caused by the socket using the same TLS settings as the network socket.

@kradalby kradalby requested a review from cure November 1, 2021 19:55
@cure
Copy link
Contributor

cure commented Nov 2, 2021

2021-10-31T14:37:01Z DBG HEADSCALE_ADDRESS environment is not set, connecting to unix socket. socket=/var/run/headscale.sock
2021-10-31T14:37:01Z TRC .../headscale/cmd/headscale/cli/utils.go:371 > Connecting via gRPC address=/var/run/headscale.sock
2021-10-31T14:37:06Z FTL Could not connect: context deadline exceeded error="context deadline exceeded"
The unix socket file exists.

This should be addressed now. Please have another try @cure.

It was caused by the socket using the same TLS settings as the network socket.

Hmm, now the headscale service won't start anymore for me:

systemd[1]: Started Headscale.
headscale[1185995]: panic: listen unix /var/run/headscale.sock: bind: address already in use
headscale[1185995]: goroutine 1 [running]:
headscale[1185995]: github.com/juanfont/headscale.(*Headscale).Serve(0xc0003b6580, 0x0, 0x0)
headscale[1185995]:         /personal/projects/headscale-krdalby/headscale/app.go:352 +0x1f66
headscale[1185995]: github.com/juanfont/headscale/cmd/headscale/cli.glob..func22(0x199f740, 0x1a636a8, 0x0, 0x0)
headscale[1185995]:         /personal/projects/headscale-krdalby/headscale/cmd/headscale/cli/server.go:25 +0xa5
headscale[1185995]: github.com/spf13/cobra.(*Command).execute(0x199f740, 0x1a636a8, 0x0, 0x0, 0x199f740, 0x1a636a8)
headscale[1185995]:         /personal/projects/go/pkg/mod/github.com/spf13/[email protected]/command.go:860 +0x2c2
headscale[1185995]: github.com/spf13/cobra.(*Command).ExecuteC(0x199ed40, 0x0, 0x1a329c0, 0x0)
headscale[1185995]:         /personal/projects/go/pkg/mod/github.com/spf13/[email protected]/command.go:974 +0x375
headscale[1185995]: github.com/spf13/cobra.(*Command).Execute(...)
headscale[1185995]:         /personal/projects/go/pkg/mod/github.com/spf13/[email protected]/command.go:902
headscale[1185995]: github.com/juanfont/headscale/cmd/headscale/cli.Execute()
headscale[1185995]:         /personal/projects/headscale-krdalby/headscale/cmd/headscale/cli/root.go:25 +0x31
headscale[1185995]: main.main()
headscale[1185995]:         /personal/projects/headscale-krdalby/headscale/cmd/headscale/headscale.go:80 +0x2cb

The /var/run/headscale.sock file exists and there is no other headscale process running.

@kradalby
Copy link
Collaborator Author

kradalby commented Nov 2, 2021

The /var/run/headscale.sock file exists and there is no other headscale process running.

Does it start if you clear out / delete the sock file before you start it?

I cant seem to get that behaviour when running with systemd, but I get it in macOS when running for dev purposes.

I am not sure what is go should clear it up vs things like systemd.

@kradalby
Copy link
Collaborator Author

kradalby commented Nov 2, 2021

Added handling for the socket, please try again, but you might have to remove the socket if it exist already.

@cure
Copy link
Contributor

cure commented Nov 2, 2021

The /var/run/headscale.sock file exists and there is no other headscale process running.

Does it start if you clear out / delete the sock file before you start it?

I cant seem to get that behaviour when running with systemd, but I get it in macOS when running for dev purposes.

I am not sure what is go should clear it up vs things like systemd.

You called it! Manually removing the socket file before starting headscale resolved the problem.

I can also confirm that the issue with headscale namespaces create is fixed. headscale namespaces destroy also works. And the issue with nodes listing offline is also gone.

@kradalby
Copy link
Collaborator Author

kradalby commented Nov 2, 2021

Ok, after looking around the internet, go does not seem to have any automatic handling, so I have added a routine waiting for different system calls to clear up the socket before we close the process. If we have other errors, we might have to clean up manually.

This should bring it into ready shape :)

@kradalby
Copy link
Collaborator Author

kradalby commented Nov 4, 2021

@cure Do you feel happy about this? Would you be up for approving? :)

Copy link
Contributor

@cure cure left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it's good to merge, thanks!

@kradalby kradalby merged commit 1c530be into juanfont:main Nov 4, 2021
@kradalby kradalby deleted the initial-api-cli-work branch March 2, 2022 08:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants